Smart Paper Notes

Our team, Hibikino-Musashi@Home (HMA for short), was founded in 2010 and is based in the Kitakyushu Science and Research Park, Japan. We have participated in the Open Platform League of the RoboCup@Home Japan Open every year since 2010. We also participated in RoboCup 2017 Nagoya as both an Open Platform League team and a Domestic Standard Platform League team. Currently, the Hibikino-Musashi@Home team has 20 members from seven different laboratories at the Kyushu Institute of Technology. In this paper, we introduce our team's activities and technologies.

Translated by Google Translate

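The recurrent neural network transducer (RNN-T) objective plays a major role in building today's best automatic speech recognition (ASR) systems. Like the connectionist temporal classification (CTC) objective, the RNN-T loss uses specific rules that define how a set of alignments is generated to form a lattice for full-sum training. However, it is still largely unknown whether these rules are optimal and lead to the best possible ASR results. In this work, we present a new transducer objective function that generalizes the RNN-T loss to accept a graph representation of the labels, providing a flexible and efficient framework for manipulating training lattices, for example to restrict alignments or to study different transition rules. We demonstrate that transducer-based ASR with a CTC-like lattice achieves better results than standard RNN-T while also ensuring a strictly monotonic alignment, which will allow better optimization of the decoding procedure. For example, the proposed CTC-like transducer system achieves a word error rate of 5.9% on the test-other condition of LibriSpeech, a relative improvement of 4.8% over an equivalent RNN-T-based system.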
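
The two figures quoted at the end of this abstract (word error rate and relative improvement) can be made concrete with a small sketch. This is not code from the paper: the function names are illustrative, the word error rate is computed with a plain word-level edit distance, and the 6.2% baseline in the usage note is a hypothetical value chosen only because it is arithmetically consistent with the reported 5.9% and 4.8%. The abstract itself does not state the baseline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer
```

For instance, `relative_improvement(6.2, 5.9)` evaluates to roughly 0.048, which shows how a 0.3-point absolute gain can correspond to a relative improvement of about 4.8%.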

Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator

Mei Minami, Yuka Masumoto, Yoshihiro Okawa, Tomotake Sasaki, Yutaka Hori

Category: Machine Learning

2021-03-05

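In many practical control applications, the performance of a closed-loop system changes over time because the plant characteristics change. There is therefore a strong need to redesign the controller without going through the system modeling process, which is often difficult for closed-loop systems. Reinforcement learning (RL) is a promising approach that enables model-free redesign of optimal controllers for nonlinear dynamical systems based only on measurements of the closed-loop system. However, the learning process of RL requires a considerable number of trial-and-error experiments on the poorly controlled system, which may accumulate wear on the plant. To overcome this limitation, we propose a model-free two-step design approach that improves the transient learning performance of RL in the optimal regulator redesign problem for unknown nonlinear systems. Specifically, we first design a linear control law that attains some degree of control performance in a model-free manner, and then train the nonlinear optimal control law while using the designed linear control law in parallel. We introduce an offline RL algorithm for designing the linear control law and theoretically guarantee its convergence to the LQR controller under mild assumptions. Numerical simulations show that the proposed approach improves both the transient learning performance and the efficiency of hyperparameter tuning in RL.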
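
To illustrate what "convergence to the LQR controller" refers to, here is a minimal sketch for a scalar discrete-time system. It is not the paper's algorithm: the paper's first step is model-free, whereas this sketch assumes the model parameters `a` and `b` are known and computes the LQR gain by iterating the Riccati recursion. The function names, the example parameters, and the reading of "in parallel" as summing the linear and nonlinear control terms are all assumptions made for illustration.

```python
def scalar_lqr_gain(a: float, b: float, q: float, r: float,
                    iterations: int = 500) -> tuple[float, float]:
    """LQR gain for x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2.

    Iterates the scalar discrete-time Riccati recursion to a fixed point p
    and returns (k, p), where the control law is u = -k * x.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


def two_step_input(x: float, k: float, nonlinear_term) -> float:
    """Linear law plus a nonlinear correction applied in parallel.

    Assumption: 'in parallel' is modeled here as a simple sum of the two
    terms; nonlinear_term stands in for the policy trained in step two.
    """
    return -k * x + nonlinear_term(x)


# Hypothetical unstable plant (|a| > 1) used only for illustration.
a, b = 1.2, 1.0
k, p = scalar_lqr_gain(a, b, q=1.0, r=1.0)

x = 1.0
for _ in range(20):
    # Before any nonlinear training, the correction term is zero.
    u = two_step_input(x, k, lambda state: 0.0)
    x = a * x + b * u
```

With these example values the closed-loop factor `a - b * k` has magnitude below one, so the state decays toward zero even while the nonlinear term contributes nothing. That is the intuition behind using the linear law to keep the plant under control during the second training step.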

This paper introduces a new open source platform for end-to-end speech processing named ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR) and adopts the widely used dynamic neural network toolkits Chainer and PyTorch as its main deep learning engines. ESPnet also follows the Kaldi ASR toolkit style for data processing, feature extraction/format, and recipes, providing a complete setup for speech recognition and other speech processing experiments. This paper explains the major architecture of this software platform, several important functionalities that differentiate ESPnet from other open source ASR toolkits, and experimental results on major ASR benchmarks.